Enhancing Big Data Feature Selection Using a Hybrid Correlation-Based Feature Selection

Abstract

This study proposes an alternate data extraction method that combines three well-known feature selection methods for handling large and problematic datasets: the correlation-based feature selection (CFS), best first search (BFS), and dominance-based rough set approach (DRSA) methods. It aims to enhance the classifier's performance in decision analysis by eliminating uncorrelated and inconsistent values. The proposed method, named CFS-DRSA, comprises several phases executed in sequence, with the main phases incorporating two crucial tasks. Data reduction is first, which implements CFS with a BFS algorithm. Secondly, a selection process applies DRSA to generate the optimized dataset. The study therefore aims to solve the computational time complexity and increase the classification accuracy. Several datasets with various characteristics and volumes were used in the experimental process to evaluate the method's credibility. The method was validated using standard evaluation measures and benchmarked against other established methods such as deep learning (DL). Overall, the proposed work proved that it could assist the classifier in returning a significant result, with an accuracy rate of 82.1% for the neural network (NN) classifier, compared to the support vector machine (SVM), which returned 66.5%, and 49.96% for DL. The one-way analysis of variance (ANOVA) statistical result indicates that the proposed method is an alternative tool for those with difficulties acquiring expensive big data tools and for those who are new to the field.
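The data reduction step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses the standard CFS merit heuristic, k * r_cf / sqrt(k + k * (k - 1) * r_ff), with Pearson correlation as an assumed correlation measure, and it replaces best first search with a simpler greedy forward search that does not backtrack. The DRSA step is not covered. The names `pearson`, `cfs_merit`, and `forward_select` and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def cfs_merit(columns, target, subset):
    """CFS merit of a feature subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute feature-target correlation.
    r_cf = sum(abs(pearson(columns[i], target)) for i in subset) / k
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all distinct pairs.
    pairs = [(i, j) for n, i in enumerate(subset) for j in subset[n + 1:]]
    r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def forward_select(columns, target):
    """Greedy forward search (assumed simplification of best first search):
    repeatedly add the feature that gives the highest merit, and stop as
    soon as no candidate strictly improves it."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        feature = max(remaining, key=lambda f: cfs_merit(columns, target, selected + [f]))
        score = cfs_merit(columns, target, selected + [feature])
        if score <= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best


# Toy data (hypothetical): feature 1 duplicates feature 0, feature 2 is weakly related.
target = [1, 2, 3, 4, 5, 6]
columns = [
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 12],
    [1, -1, 1, -1, 1, -1],
]
selected, merit = forward_select(columns, target)
print(selected, merit)
```

In this toy case the duplicated feature does not raise the merit, because its correlation with the already selected feature cancels its correlation with the target, so the search stops without adding it. That is the sense in which CFS favors features that are correlated with the class but not with each other.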

Similar resources

Feature Selection in Structural Health Monitoring Big Data Using a Meta-Heuristic Optimization Algorithm

This paper focuses on the processing of structural health monitoring (SHM) big data. Extracted features of a structure are reduced using an optimization algorithm to find a minimal subset of salient features by removing noisy, irrelevant and redundant data. The PSO-Harmony algorithm is introduced for feature selection to enhance the capability of the proposed method for processing the measure...

Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not perform well under these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples become extremely hard. As a result, neural networks and clustering a...

A Hybrid Feature Subset Selection Algorithm for Analysis of High Correlation Proteomic Data

Pathological changes within an organ can be reflected as proteomic patterns in biological fluids such as plasma, serum, and urine. Surface-enhanced laser desorption and ionization time-of-flight mass spectrometry (SELDI-TOF MS) has been used to generate proteomic profiles from biological fluids. Mass spectrometry yields redundant, noisy data in which most data points are irrelevant features ...

H-BwoaSvm: A Hybrid Model for Classification and Feature Selection of Mammography Screening Behavior Data

Breast cancer is one of the most common cancers in the world. Early detection of cancer significantly reduces morbidity rates and treatment costs. Mammography is a known, effective method for diagnosing breast cancer. One way to identify mammography screening behavior is to evaluate women's awareness of participating in mammography screening programs. Today, intelligent systems could...

MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection

Multi-label classification has gained significant attention in recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which applies a new strategy for multi-label learning by leveraging label-specific ...

Journal

Journal title: Electronics

Year: 2021

ISSN: 2079-9292

DOI: https://doi.org/10.3390/electronics10232984